PROCESS FOR THE IDENTIFICATION OF A DOCUMENT

The present invention relates to a process for the
automatic identification of a document in a computer, to a
process for automatically classifying and filing edited
documents in a computer using the identification process, and
to a device for practicing the same.

Generally speaking, when a document such as a letter, a
report or a text is printed from a computer, it is often
necessary to copy this document so as to file it subsequently
under different classifications permitting its retrieval by
various search routes. For example, a document can be
classified by client file, by the file of the object with
which the document deals, or else chronologically among the
original documents. Such a filing technique takes up space
and time.

There are already known filing processes available in
computers, associated with a process for searching filed
documents. Thus, when a document such as a letter has been
prepared and it is decided to file it, recourse is had to a
document classification system in which the classification
parameters are defined, such as for example the type of
document, the person, an event, etc. Once these parameters
are chosen, one or several classifications are used, a
classification containing the documents having a common
characteristic. During a search, one or several of these
search criteria are entered, permitting access to said
document. There is thus a personal archive of the documents.

It has also been proposed to conduct automatic archiving
of the edited documents, in which the nature of the document
is recognized by the printer. This recognition is carried
out, for example, on the first document edited in a stream of
documents: once this first document has been recognized as
belonging to a classification according to the particular
characteristics that it possesses, the rest of the edited
documents are also memorized in the same classification.

Such a process requires a substantial learning phase,
to the extent that it is necessary to define first of all the
identification criteria of a type of document for its
recognition and its filing in a data base. Such a learning
process is difficult to practice and is not within the skill
of an ordinary user.

There is also known from WO-A-9702536 a process for
filing documents in which an acquisition and reading circuit
controls the memory of a microcomputer by means of the last
signals output from an operating printer, which produce the
presentation of the printed material, and files in the
microcomputer an identical binary reproduction of the document
to be printed, for example on a non-erasable data support such
as a CD-ROM. However, such a process, while it copies all the
documents edited by a printer, does not permit classified
filing of the documents permitting their retrieval as a
function of specific identification criteria.

A major drawback of these filing processes consists
in that it is difficult to identify a document automatically
and easily so as to classify it and file it, for example.
Thus, if one identification criterion is, for example, the
presence of a term in the document, all the documents
including this term will be identified, which is not of
interest for classification.

To this end, the invention has first of all as its
object a process for the automatic identification of a
document in a computer, characterized in that the data
contained in the document are analyzed according to their
content and/or their position in the document and they are
compared one by one to one or more document identification
criteria, an identification criterion being defined by the
content and/or the position of a characteristic datum of a
document.

Thus, such a process preferably permits identifying
a document without ambiguity. A positive comparison between
one datum of a document and an identification criterion of a
type of document permits identifying the analyzed document as
being of the same type as that defined by the identification
criterion, because it has a datum whose content and/or
position is identical to the identification criterion of the
type of document.

Thus, an identification criterion of an invoice
could be the term "invoice" at a distance of 3 cm from the
left margin and at a distance of 6 cm from the upper edge of
the invoice. If the analyzed document has a datum whose
content is "invoice" and whose position is 3 cm from the left
margin and 6 cm from the upper margin of the document, the
comparison between this datum and the identification criterion
of an invoice is positive and the document is identified as
being an invoice. Another identification criterion could be
the search for references, the document then being identified
as an invoice having such references.
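
By way of illustration only, the comparison of a datum with an
identification criterion can be sketched as follows in Python.
The names Datum, Criterion and identify, the use of centimetres,
the 0.1 cm tolerance and the case-insensitive text comparison
are assumptions made for this sketch; the description itself
only requires that the content and/or the position correspond.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Datum:
    """One piece of text found in a document, with its position in cm."""
    content: str
    left_cm: float
    top_cm: float


@dataclass(frozen=True)
class Criterion:
    """Hypothetical identification criterion: a content and/or a position."""
    doc_type: str
    content: Optional[str] = None
    left_cm: Optional[float] = None
    top_cm: Optional[float] = None
    tolerance_cm: float = 0.1  # assumed tolerance, not taken from the text

    def matches(self, datum: Datum) -> bool:
        # Content test (case-insensitive by assumption), skipped if unset.
        if self.content is not None and datum.content.lower() != self.content.lower():
            return False
        # Position tests, each skipped if the criterion leaves it unset.
        if self.left_cm is not None and abs(datum.left_cm - self.left_cm) > self.tolerance_cm:
            return False
        if self.top_cm is not None and abs(datum.top_cm - self.top_cm) > self.tolerance_cm:
            return False
        return True


def identify(data, criteria):
    """Compare the data one by one to the criteria; return the first matching type."""
    for datum in data:
        for criterion in criteria:
            if criterion.matches(datum):
                return criterion.doc_type
    return None  # negative comparison: the document is unidentified


invoice = Criterion("invoice", content="invoice", left_cm=3.0, top_cm=6.0)
page = [Datum("ACME Ltd", 3.0, 2.0), Datum("Invoice", 3.0, 6.0)]
print(identify(page, [invoice]))  # prints: invoice
```

A datum reading "Invoice" at 10 cm from the left margin would
not satisfy this criterion, which is how the position removes
the ambiguity of a purely textual search.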

Preferably, when the comparison is negative, it is
proposed to define at least one new identification criterion,
and a parametric window is opened to define at least one new
identification criterion associated with this document, said
new criterion being memorized to identify thereafter any
document having this datum. An identification criterion is
defined by copying a datum of the wording of the document,
serving as an identification criterion, toward the parametric
window, where this datum is copied associated with its
position parameters in the document.

Thus, the identification process according to the
invention permits progressively creating identification
criteria for new documents, and any datum whatsoever of a
document can serve to define an identification criterion.
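
The progressive creation of criteria can be sketched as follows
in Python. This is only an illustration under stated
assumptions: a datum is represented as a (content, left_cm,
top_cm) tuple, a criterion matches only on exact equality, and
the callback ask_user is a hypothetical stand-in for the
parametric window.

```python
def identify_or_learn(data, criteria, ask_user):
    """Identify a document, or memorize a new criterion for it.

    data     -- list of (content, left_cm, top_cm) tuples
    criteria -- dict mapping such a tuple to a document type
    ask_user -- hypothetical callback standing in for the parametric
                window; returns (chosen datum, type name), or None if
                the user declines to define a criterion
    """
    # Compare the data one by one to the memorized criteria.
    for datum in data:
        if datum in criteria:
            return criteria[datum]

    # Negative comparison: offer to define a new criterion.
    choice = ask_user(data)
    if choice is None:
        return None
    datum, doc_type = choice
    criteria[datum] = doc_type  # memorized for subsequent documents
    return doc_type


criteria = {}
letter = [("Dear Sir", 2.5, 9.0), ("Yours faithfully", 2.5, 24.0)]

# First pass: nothing is known, so the user picks the first datum.
first = identify_or_learn(letter, criteria, lambda data: (data[0], "letter"))
# Second pass: the memorized criterion identifies the document unaided.
second = identify_or_learn(letter, criteria, lambda data: None)
```

After the first call the dictionary holds one criterion, and the
second call succeeds without consulting the user.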

The present invention also has for its object to
provide a process for the automatic classification and filing
of individual edited documents in a computer, in which each
document sent to a printer can be identified, thanks to the
above identification process, so as to file it, and in which
new identification criteria can be defined when the document
does not contain any already-known identification criterion.

To this end, the present invention has for its
object a process for the automatic classification and filing
of edited documents from a computer, characterized in that,
when the document to be edited is sent to a publishing
support, the data contained in the document are analyzed
according to their content and/or their position in the
document and are compared one by one to one or more of the
identification criteria of the document, an identification
criterion being defined by the content and/or the position of
a characteristic datum of a document; these data of the
document are also compared to at least one classification
criterion, each criterion corresponding to a type of document
to be classified and to a classification in the memory of the
computer in which are memorized the documents having this
criterion; and when the comparison is negative, there can be
defined at least one new identification criterion
corresponding to a type of document and at least one
classification criterion corresponding to a classification in
the memory of the computer in which said document is copied;
and, when the comparison is positive, said document is
automatically copied into the memory of the computer according
to the corresponding classification or classifications.

Thus, preferably, the process according to the
invention permits automatically filing a document edited from
a computer toward a publishing support to the extent that,
each time a document is edited, it is automatically
identified because it has one or several identification
criteria representative of a document, the classification
criteria permitting determining the number of classifications
to which this document belongs and which contain documents of
the same type.

By publishing support is meant printers but also
telecopiers (fax machines) integrated with a computer,
transmissions over the internet or an intranet, copying
supports, etc. Thus, it is in a sense the publishing support
which recognizes the documents sent to it from the computer,
copies them, sorts them and automatically classifies them.

In this way, a document can be filed according to 
several classifications, which offers multiple search routes, 
the process according to the invention therefore offering the 
possibility of sorting and classifying edited documents. 

Preferably, the document is copied only once into
the memory, but as the identification criteria are associated
with classification criteria, there exist as many possible
search routes as classification criteria. Thus, the process
according to the invention offers filing and classification
with a minimum of occupied space.
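
One way to picture a single stored copy reachable by several
search routes is an index from classification to document
identifier. The following Python sketch is an assumption about
how this could be organized, not a description of the actual
memory layout; the class Archive, its methods and the sample
labels are hypothetical.

```python
from collections import defaultdict


class Archive:
    """Hypothetical store: each document is kept once, indexed many times."""

    def __init__(self):
        self.documents = {}            # doc_id -> content (the single copy)
        self.index = defaultdict(set)  # classification -> set of doc_ids
        self._next_id = 1

    def file(self, content, classifications):
        """Copy the document once and register it under each classification."""
        doc_id = self._next_id
        self._next_id += 1
        self.documents[doc_id] = content
        for classification in classifications:
            self.index[classification].add(doc_id)
        return doc_id

    def search(self, classification):
        """Follow one search route; unknown classifications yield nothing."""
        doc_ids = self.index.get(classification, ())
        return [self.documents[doc_id] for doc_id in sorted(doc_ids)]


archive = Archive()
archive.file("Invoice no. 17", ["invoice", "client:Dupont"])
```

Here the invoice occupies one entry in documents but is
retrievable through either search("invoice") or
search("client:Dupont").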

Moreover, when the document is identified as being
unknown, which is to say responding to none of the
identification criteria, the process automatically permits
the definition of at least one new identification criterion
relative to this new document. From the new identification
criterion, it is also possible to define at least one
classification criterion. Once these criteria are defined,
the document is filed in a classification adapted to contain
progressively the documents of the same classification
criterion and of the same identification criterion, whilst the
identification criterion or criteria as well as the
classification criterion or criteria are memorized and will
serve for the identification of subsequent edited documents
and their filing.

Thus, the process according to the invention offers 
the possibility of defining progressively with the emission of 
different documents, new identification criteria and classifi- 
cation criteria characteristic of these documents. 

There results a great flexibility in that it is not
necessary to define a document in advance.

According to a first embodiment, the identification
criterion and the classification criterion are defined at the
same time as the document is sent toward a publishing support
such as for example a printer. This process therefore permits
individually processing each document arriving at the
publishing support.

Thus, the document en route toward the publishing
support is analyzed. When the comparison is negative, the
user can then define one or more identification criteria to
recognize this type of document thereafter, as well as the
classification criteria to file it in the different types of
classification, and then edit this document. The user can
also choose not to define the identification criteria, in
which case the document is simply published.

When the comparison is positive, the recognized 
document is filed and published. 

According to a second embodiment of the process of 
the invention, all the unidentified edited documents from one 
or several computers are stored temporarily in a library where 
a user, preferably an authorized one, can cause the analysis 
of these documents. 

Thus, the analyzed documents are either identified 
and filed or unidentified and, in this case, it is possible to 
define new identification criteria and classification criteria 
or to destroy the documents which one does not desire to 
identify and file. 

This embodiment is preferable, because it permits 
storage of the edited documents from one or several computers 
in a single library which permits greater security by giving 
to each user emitting a document the possibility of destroying 
this document if it is not identified and filed. 
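
A minimal Python sketch of such a temporary library is given
below, purely as an illustration. The class PendingLibrary, its
method names and the representation of a document as a text
string with an owner are assumptions; the identify argument
stands for any identification function returning a document
type or None.

```python
class PendingLibrary:
    """Hypothetical temporary store for documents awaiting analysis."""

    def __init__(self):
        self._pending = []  # list of (owner, document text)

    def add(self, owner, text):
        """Store a document edited from any computer."""
        self._pending.append((owner, text))

    def analyze(self, identify):
        """Identify pending documents; keep the unidentified ones pending.

        Returns the identified documents as (owner, text, doc_type).
        """
        filed, still_pending = [], []
        for owner, text in self._pending:
            doc_type = identify(text)
            if doc_type is None:
                still_pending.append((owner, text))
            else:
                filed.append((owner, text, doc_type))
        self._pending = still_pending
        return filed

    def destroy(self, owner):
        """Let a user discard their own unidentified documents."""
        before = len(self._pending)
        self._pending = [(o, t) for o, t in self._pending if o != owner]
        return before - len(self._pending)


library = PendingLibrary()
library.add("alice", "INVOICE 42")
library.add("bob", "shopping list")
filed = library.analyze(lambda text: "invoice" if "INVOICE" in text else None)
removed = library.destroy("bob")
```

In this sketch the invoice is handed on for filing, while the
second document remains in the library until its owner destroys
it or new criteria are defined for it.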

Preferably, each identification criterion is
specific to a characteristic datum of a document, and an
identification criterion is constituted both by the content of
the datum belonging to the text of the document and by the
positioning parameters of this datum in the document.

The analysis of the data contained in the document
to be edited takes place preferably at two levels: on the one
hand the search for the content of a datum in the text of the
document, and on the other hand the search for the position
parameters of this datum in the document.

Upon a negative comparison between the identification
criteria already known and a document to be edited, this
document is recognized as unidentified and hence not fileable.
A parametric window is thus opened to define at least one new
identification criterion associated with this document and at
least one classification criterion, said new criterion being
memorized to identify thereafter any document responding to
the same identification criterion, and the classification
criterion controlling its automatic filing in the associated
classification. This definition is carried out by copying a
datum of the text of the document, serving as an
identification criterion, toward the parametric window, where
the datum is copied, associated preferably with its position
parameters in the document. The document is then copied into
the memory of the computer in a classification associated with
the new identification criterion.

Thus, the definition of the identification criteria
takes place in a very simple manner, by a simple manipulation
(copy, drag and paste) which is within the reach of all
computer users, even beginners.

The identification criterion can be a type of
criterion in which several terms are admitted. For example,
at a given place in the document, there can be the term
"invoice", but one should also be able to identify the terms
"credit note", "debit note", etc., at the same position, as
identifying a document of the same type belonging to the same
classification. By "term" is meant, for example, a word, a
series of letters or numbers, or a given number of numerals or
of letters, no matter what their meaning.
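
A criterion admitting several terms at one position can be
sketched as follows in Python, using the standard re module.
The function term_criterion, the exact comparison of positions
and the case-insensitive matching are assumptions made for this
illustration only.

```python
import re


def term_criterion(alternatives, left_cm, top_cm):
    """Build a test accepting any of several terms at one position.

    The position must be exactly equal in this sketch; a real system
    would presumably allow some tolerance.
    """
    # re.escape keeps each alternative literal; "|" joins them as choices.
    pattern = re.compile(
        "|".join(re.escape(term) for term in alternatives), re.IGNORECASE
    )

    def matches(content, datum_left_cm, datum_top_cm):
        return (
            datum_left_cm == left_cm
            and datum_top_cm == top_cm
            and pattern.fullmatch(content) is not None
        )

    return matches


is_billing_document = term_criterion(["invoice", "debit note"], 3.0, 6.0)
```

With this criterion, "Invoice" and "Debit Note" found at 3 cm
and 6 cm both identify the same type of document, whereas the
same words elsewhere on the page do not.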

The new identification criteria defined are memo- 
rized and will serve for the identification of a document 
having the same characteristics during subsequent publishing. 

There are also defined the criteria for classification
which serve for automatic filing of said document in a
given classification, these criteria corresponding to the
presence of a term or a series of terms in a document, for
example.

The process according to the invention can be
associated with one or several publishing supports, preferably
at the choice of the user, who then selects not only a
publishing support but also the association of this
publishing support with the process of automatic
classification and filing according to the invention.

The present invention also has for its object a
device for practicing the process according to the invention,
characterized in that it comprises means for analyzing the
data contained in a document to be edited, sent from a
computer toward a publishing support, means for comparing said
data one by one with one or several identification criteria in
the memory, means for copying into the memory of the computer
the document upon a positive comparison, and means for
defining at least one identification criterion and at least
one classification criterion of a document upon a negative
comparison.

Preferably, the means for analyzing the data can be
constituted by means for analyzing and transcoding the signals
emitted from the computer toward the publishing support.

When the comparison is negative, means for defining
new identification criteria are triggered, these definition
means comprising copying means (copy, drag and paste)
permitting transfer toward a memory of a portion of the
document, both as to its text and as to the parameters
defining its place in the document, said definition means also
comprising means for writing data permitting the user to name
a type of document associated with these identification
criteria, for example. The newly-defined identification
criteria are kept in the memory and permit an identification,
a classification and an automatic filing of a document upon
subsequent publishing.

Preferably, the device for practicing the process
according to the invention is associated with the printing
control of a printer. The positioning of the device relative
to the publishing support can therefore be selected. Thus,
among several printers connected to the computer, one or
several of these printers are selected to work with the
device according to the invention, and only the prints sent
toward these printers will be identified, classified and
filed. Any other publishing support (fax, internet, etc.) can
be associated with the printers and hence with the device
according to the invention.

The invention will now be described in greater
detail with reference to the drawing, in which the single
figure shows schematically the different steps of the process
of classification and automatic filing according to the
invention.

When a document to be published is sent from a
computer (PC) toward a publishing support such as a printer 1,
the identification of the document to be published is carried
out by analyzing the data contained in said document and
comparing them one by one to one or several identification
criteria.

These identification criteria permit characterizing
a document both by its content and by the position of the data
in said document. Thus, an invoice could be characterized by
identification criteria such as the term "invoice" at 15 cm
from the left edge and at 6 cm from the upper edge of the
document, and the term "date" at 15 cm from the left edge and
8 cm from the upper edge of the document.

For a letter, one of the identification criteria
could correspond to the range of position in the letter
corresponding to the address and in which is recognized a
series of five digits (postal code) for example.
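
A criterion of this kind, combining a range of position with a
pattern rather than a fixed word, can be sketched as follows in
Python. The bounds of the address block, the function names and
the (content, left_cm, top_cm) representation of a datum are
assumed values chosen for illustration only.

```python
import re

# Exactly five consecutive digits, not part of a longer number.
FIVE_DIGITS = re.compile(r"\b\d{5}\b")


def in_range(value, low, high):
    """True if value lies within the closed interval [low, high]."""
    return low <= value <= high


def looks_like_letter(data, left_range=(10.0, 19.0), top_range=(4.0, 8.0)):
    """Detect a five-digit series inside the assumed address block.

    data -- iterable of (content, left_cm, top_cm) tuples
    The default ranges are hypothetical address-block bounds in cm.
    """
    for content, left_cm, top_cm in data:
        if (
            in_range(left_cm, *left_range)
            and in_range(top_cm, *top_range)
            and FIVE_DIGITS.search(content)
        ):
            return True
    return False
```

For example, looks_like_letter([("75008 Paris", 11.0, 6.5)])
is satisfied, whereas the same text placed at the foot of the
page, or a six-digit number in the address block, is not.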

When these two identification criteria are recognized
in the document to be published, the classification criteria,
such as the name of the client, are noted and this document is
copied into the memory of the computer according to a
classification including the documents of the same type, such
as "invoice", and in a classification connected to a client,
for example within the classification "invoice". The
combination of identification criteria and of classification
criteria refines the classification.

As a function of the classification criteria
defining a classification, the document can be accessible by
several search routes.

When the document to be published does not respond 
to any of the identification criteria, there is the possibili- 
ty of defining new identification criteria and new classifica- 
tion criteria to define in the memory a corresponding classi- 
fication so as to permit its copying and its automatic filing 
according to these new criteria and the automatic filing of 
further documents having the same criterion which will be 
subsequently published. 